Enabling NVM for Data-Intensive Scientific Services
Abstract
Specialized, transient data services are playing an increasingly prominent role in data-intensive scientific computing. These services offer flexible, on-demand pairing of applications with storage hardware using semantics that are optimized for the problem domain. Concurrent with this trend, upcoming scientific computing and big data systems will be deployed with emerging non-volatile memory (NVM) technology to achieve the highest possible price/productivity ratio. We must therefore develop techniques that bring specialized data services and NVM technology together. In this work we explore how to enable the composition of NVM resources within transient distributed services while still retaining their essential performance characteristics. Our approach eschews the conventional shared file system model and instead projects NVM devices as remote microservices that leverage user-level threads, remote procedure call (RPC) services, remote direct memory access (RDMA) enabled network transports, and persistent memory libraries to maximize performance. We describe a prototype system that incorporates these concepts, evaluate its performance for key workloads on an exemplar system, and discuss how the system can be leveraged as a component of future data-intensive architectures.
Similar resources
The Design and Implementation of a Non-Volatile Memory Database Management System
This dissertation explores the implications of emergent non-volatile memory (NVM) technologies for database management systems (DBMSs). The advent of NVM will fundamentally change the dichotomy between volatile memory and durable storage in DBMSs. These new NVM devices are almost as fast as DRAM, but all writes to them are potentially persistent even after power loss. Existing DBMSs are unable to t...
Performance Evaluation and Modeling of HPC I/O on Non-Volatile Memory
HPC applications place high demands on I/O performance and storage capability. Emerging non-volatile memory (NVM) techniques offer low latency, high bandwidth, and persistence for HPC applications. However, the existing I/O stack is designed and optimized on the assumption of disk-based storage. To effectively use NVM, we must reexamine the existing high performance computing (HPC) I/O...
Managing Hybrid Main Memories with a Page-Utility Driven Performance Model
Hybrid memory systems comprised of dynamic random access memory (DRAM) and non-volatile memory (NVM) have been proposed to exploit both the capacity advantage of NVM and the latency and dynamic energy advantages of DRAM. An important problem for such systems is how to place data between DRAM and NVM to improve system performance. In this paper, we devise the first mechanism, called UBM (page Ut...
Joy Arulraj - Research Statement
Modern data-intensive applications apply statistical analysis algorithms on massive databases to deliver qualitatively better results in many domains, including science, governance, and business. These applications require low-latency, always-on, and cost-effective data management. This places increased demands on today’s database management systems (DBMSs) whose architectures are tailored for t...
Protocols and Services for Distributed Data-Intensive Science
We describe work being performed in the Globus project to develop enabling protocols and services for distributed data-intensive science. These services include:
* High-performance, secure data transfer protocols based on FTP, plus a range of libraries and tools that use these protocols
* Replica catalog services supporting the creation and location of file replicas in distributed systems
These...